The Organization of Lookup Tables in Instruction Memoization
Authors
Abstract
Instruction Memoization is a technique that enables performing multiple-cycle computations in a single cycle by exploiting redundant instructions. The technique is based on the notion of memoing: saving the input and output of previous calculations and using the output if the input is encountered again. Several recent papers have shown that non-trivial speedup (over 10%) can be obtained by using this technique. The speedup can be attributed to three major factors: (i) the percentage of instructions that can benefit from memoization, (ii) the integration of the Lookup Table (LUT) into the datapath of the processor, and (iii) the percentage of successful lookups. This paper focuses on the last factor and explores how the organization of a LUT influences the percentage of successful lookups (hit-ratio). The best design for a LUT, which we call a memo-table, that yields the highest hit-ratio is described. The design chosen is based on experimental simulations that minimize the influence of the other two factors of speedup. After the memo-table design is fixed, the first two factors can be dealt with without having to alter the memo-table to accommodate changes in (i) the instruction mix of an application and (ii) the microprocessor's datapath. The characteristics of the memo-tables explored are size, set associativity, indexing, replacement methods, and contents (instruction mix). Results of simulations have shown that the optimal design is to use 5 memo-tables, each of which holds a "family" of instructions. Each memo-table contains 256 entries in sets of 4. Entries are replaced randomly and are indexed by using a mix of bits from the exponent and mantissa of FP numbers or the Least Significant Bits of integers. Trivial operations are detected using dedicated circuitry and aren't entered into the memo-tables.
Integrating the proposed design into the datapaths of the MIPS R10000 and PPC 604e results in an average 8-10% speedup for applications that heavily utilize multiple-cycle instructions.
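The memo-table organization described in the abstract can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the paper's hardware design: the class `MemoTable`, the helper `memo_multiply`, and the XOR-of-operands index function are hypothetical names and choices introduced for illustration. The abstract only states that integers are indexed by their least significant bits, that each table has 256 entries in sets of 4, that replacement is random, and that trivial operations bypass the table.

```python
import random


class MemoTable:
    """Software model of one memo-table: 256 entries in sets of 4."""

    def __init__(self, entries=256, ways=4, seed=0):
        self.ways = ways
        self.num_sets = entries // ways            # 64 sets by default
        self.sets = [[] for _ in range(self.num_sets)]
        self.rng = random.Random(seed)             # seeded for repeatable runs
        self.hits = 0
        self.lookups = 0

    def _index(self, a, b):
        # Assumption: XOR the two integer operands and keep the least
        # significant bits. The abstract does not give the exact function.
        return (a ^ b) & (self.num_sets - 1)

    def lookup(self, a, b):
        """Return the stored result for (a, b), or None on a miss."""
        self.lookups += 1
        for key, result in self.sets[self._index(a, b)]:
            if key == (a, b):
                self.hits += 1
                return result
        return None

    def insert(self, a, b, result):
        """Store a result; evict a random way when the set is full."""
        ways = self.sets[self._index(a, b)]
        if len(ways) < self.ways:
            ways.append(((a, b), result))
        else:
            ways[self.rng.randrange(self.ways)] = ((a, b), result)

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def memo_multiply(table, a, b):
    """Integer multiply that consults the memo-table first."""
    # Trivial operations are filtered out and never enter the table.
    if a == 0 or b == 0:
        return 0
    if a == 1:
        return b
    if b == 1:
        return a
    cached = table.lookup(a, b)
    if cached is not None:
        return cached
    result = a * b                                 # stands in for the multi-cycle unit
    table.insert(a, b, result)
    return result
```

Calling `memo_multiply` twice with the same non-trivial operands produces one miss followed by one hit, so `hit_ratio()` reports 0.5. In this model, a separate `MemoTable` instance per instruction family would correspond to the five-table arrangement the abstract mentions.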
Similar resources
Hardware Memoization of Mathematical and Trigonometric Functions
Memoization is saving the input and output of previous calculations and using the output if the input is encountered again. Memo-tables are cache-like tables that store the operands and results of calculations that are candidates for memoization. A successful lookup gives the result of a multi-cycle computation in a single cycle, and a failed lookup doesn't necessitate a penalty in computation ...
A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traffic and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), for which existing solutions such as BSD Radix Tries scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...
Temporal Memoization for Energy-Efficient Timing Error Recovery in GPGPU Architectures
Manufacturing and environmental variability lead to timing errors in computing systems that are typically corrected by error detection and correction mechanisms at the circuit level. The cost and speed of recovery can be improved by memoization-based optimization methods that exploit spatial or temporal parallelisms in suitable computing fabrics such as general-purpose graphics processing units...
60-Year Hindcast of Wave Characteristics in the South of the Persian Gulf Using the SWAN Model with a Quasi-Temporal Method
The goal of this study was to simulate the wave characteristics in the south of the Persian Gulf over a 60-year period. Wind data from the study area, including offshore measurements, satellite observations and numerical atmospheric models, were collated and analyzed. The modified NCEP/NCAR wind field, which has the longest duration among the wind data sources, was chosen as input to the SWAN model. SWAN mod...
Publication date: 2000